Addressing Malicious Noise in Clickthrough Data

Author

  • Filip Radlinski

Abstract

Clickthrough logs are an increasingly common source of training data for learning ranking functions. Because position in search results has a large impact on commercial websites, malicious noise is bound to appear in search engine click logs. We present preliminary work on addressing this form of noise, which we term click-spam. We analyze click-spam from a utility standpoint and investigate whether personalizing web search results by partitioning the user population can reduce or eliminate the financial incentives for potential spammers. We formalize click-spam, analyze the incentives for malicious agents, and then investigate the model with some examples.

Similar resources

Improving Relevance Prediction by Addressing Biases and Sparsity in Web Search Click Data

In this paper, we present our approach and findings from participating in the 2012 Yandex Relevance Prediction Challenge. Our approach has two goals: on one hand, we aim to address four types of biases, namely, position-bias, perception-bias, query-bias, and session-bias to better interpret the clickthrough information; on the other hand, we aim to address the clickthrough sparsity by exploiting var...

CWI at the Photo Retrieval Task of ImageCLEF 2009

CWI’s experiments investigate the usefulness of clickthrough data for improving the diversity of image retrieval results. We use the search logs provided to us by Belga to find relevant images; we consider that these correspond to images clicked for queries exactly matching or best matching a topic’s title and cluster titles. To reduce the noise, we also filter these results and only consider t...

Query Session Data vs. Clickthrough Data as Query Suggestion Resources

Query suggestion has become one of the most fundamental features of Web search engines. Some query suggestion algorithms utilize query session data, while others utilize clickthrough data. The objective of this study is to examine which of these two resources can provide more effective query suggestions. Our results show that query session data outperforms clickthrough data in terms of clickthr...

Learning Phrase-Based Spelling Error Models from Clickthrough Data

This paper explores the use of clickthrough data for query spelling correction. First, large amounts of query-correction pairs are derived by analyzing users' query reformulation behavior encoded in the clickthrough data. Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system. Experiments...

Spying Out Accurate User Preferences for Search Engine Adaptation

Most existing search engines employ static ranking algorithms that do not adapt to the specific needs of users. Recently, some researchers have studied the use of clickthrough data to adapt a search engine’s ranking function. Clickthrough data indicate for each query the results that are clicked by users. As a kind of implicit relevance feedback information, clickthrough data can easily be coll...

Publication date: 2007